Phase transition in the economically modeled growth of a cellular nervous system 
Spatially-embedded complex networks, such as nervous systems, the Internet and transportation
networks, generally have non-trivial topological patterns of connections combined with nearly min- 
imal wiring costs. However the growth rules shaping these economical trade-offs between cost and 
topology are not well understood. Here we study the cellular nervous system of the nematode worm 
C. elegans, together with information on the birth times of neurons and on their spatial locations. 
We find that the growth of this network undergoes a transition from an accelerated to a constant 
increase in the number of links (synaptic connections) as a function of the number of nodes (neur- 
ons). The time of this phase transition coincides closely with the observed moment of hatching, 
when development switches metamorphically from oval to larval stages. We use graph analysis and 
generative modelling to show that the transition between different growth regimes, as well as its 
coincidence with the moment of hatching, can be explained by a dynamic economical model which 
incorporates a trade-off between topology and cost that is continuously negotiated and re-negotiated 
over developmental time. As the body of the animal progressively elongates, the cost of longer dis- 
tance connections is increasingly penalised. This growth process regenerates many aspects of the 
adult nervous system's organization, including the neuronal membership of anatomically pre-defined 
ganglia. We expect that similar economical principles can be found in the development of other 
biological or man-made spatially-embedded complex systems. 
In the last decade or so there has been an abundance
of studies demonstrating that superficially diverse sys- 
tems share important statistical properties [IHl]. Movie 
co-star networks, transport and communication systems, 
gene-gene interactomes, and many other natural and 
man-made systems have similarly complex topological 
features: they are generally efficient, small-world, mod- 
ular systems with a greater-than-random probability of 
highly connected nodes or hubs. Many but not all of 
these topologically complex systems are also spatially 
embedded [S]. For example, both the Internet and the 
World Wide Web have non-trivial topologies but only 
the Internet is physically instantiated as a network in 
a metric space. Spatially embedded networks generally 
increase in cost with increasing distance of connections 
between nodes; and this cost constraint must be traded- 
off against the functional advantages of topological fea- 
tures like hub nodes, robustness, and high global effi- 
ciency, that may add value but at greater than minimal 
cost [6]. Nervous systems share these general economical
properties [7]: at all scales of space and time and in all
species it is likely that brain networks are both parsimo- 
niously wired ^ and topologically complex [3]. 



* V.N. and P.E.V. contributed equally to this work.
† To whom correspondence should be addressed.
Email: etb23@cam.ac.uk



This was first demonstrated in the case of the net- 
work of neurons that comprises the nervous system of 
the nematode worm, Caenorhabditis elegans 01 En] • The 
brain of the hermaphrodite worm consists of 279 neur- 
ons (excluding the pharyngeal neurons), and is a sparse 
network (4% of maximum connection density) , with the 
majority of connections being between cells separated by 
short distances (< 10% of the overall body length of 
the adult worm). Both sparse connection density and 
low connection distance are as expected by the operation 
of a parsimonious drive to minimize wiring cost. How- 
ever, the wiring cost of the C. elegans connectome is not 
strictly minimized [11-13]: further reductions of connection
distance can be achieved by re-wiring the biological
network in silico; but only at the expense of increasing 
the shortest topological path between neurons [M], thus 
reducing the overall system efficiency. To put it another 
way, it seems there is a trade-off between connection dis- 
tance and topological efficiency in the organization of the 
adult nematode worm's nervous system. Topological ef- 
ficiency is theoretically advantageous for globally integ- 
rated information processing and coordinated behaviors, 
but it is disproportionately expensive to engineer [7, 13].
It is arguable that such economical trade-offs between to- 
pological value and physical cost are likely to be a general 
selection pressure on formation of spatially embedded 
and topologically complex networks. More specifically,



we predicted that economical principles applied dynam- 
ically over the course of developmental time (100s of mins 
after fertilization) could provide a reasonable account of 
the emergence of multiple observed features of the growth 
and adult configuration of the nematode's nervous sys- 
tem. 



RESULTS 

Here we investigate the growth of the C. elegans connectome,
from the moment of fertilization through hatching
of the egg and larval elongation to adulthood [16, 17].
Importantly, we note that the physical distances between
neurons increase as a function of the increasing overall
length of the worm's body as it matures; see Figure 1a.
The cells of the adult nervous system are concentrated in 
the head and the tail of the worm, with a series of neurons 
running along the length of the body to innervate local 
muscle groups (the ventral cord). This system can be 
decomposed into 10 ganglia (or neuronal groups) based 
on anatomical properties [HI [THj; see Figure 1b. The
birth times of each neuron tend to cluster in two time
windows, separated by a "quiet" period which includes
the time of hatching (800 minutes after fertilization); see
Figure 1c. The developmental changes in the number of
nodes (N) and edges (K) in the network occur in the context
of progressive elongation of the worm's body, from
less than 50 μm before hatching to more than 1 mm in the
adult.

The two growth spurts in neuronal number, before 
and after hatching, are paralleled by a roughly synchron- 
ous increase in the total number of synaptic connections 
between neurons (Figure 1c). However, the form of the
relationship between N and K is evidently different before
and after hatching, as shown in Figure 1d. The initial
increase in K is well described by a quadratic function
of N, implying that the average node degree increases linearly
as the network grows (see inset). Then, at N ≈ 200,
hatching takes place, marking the metamorphic change 
of the worm from egg to larva. This event coincides with 
a discontinuous change in growth rules: after hatching, 
K increases linearly with N, so that the average node de- 
gree remains constant. This experimental evidence sug- 
gests that sharp qualitative changes can indeed affect the 
growth rules governing the development and the forma- 
tion of complex networks [TJ [31 [20]. In this case, the 
transition from one growth regime to another coincides 
with a metamorphic change of the worm, from egg to 
larva. 
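
The link between the two scaling regimes and the average node degree can be made explicit with a short derivation. This is a sketch under idealized assumptions: the coefficients a, b and c are hypothetical fit constants that do not appear in the text, and the average degree of an undirected graph is taken to be $\langle k \rangle = 2K/N$.

$$
\begin{aligned}
K \simeq a N^2 \;&\Rightarrow\; \langle k \rangle = \frac{2K}{N} \simeq 2aN, \\
K \simeq bN + c \;&\Rightarrow\; \langle k \rangle \simeq 2b + \frac{2c}{N}.
\end{aligned}
$$

In the first regime the average degree grows in proportion to N; in the second it stays close to the constant 2b whenever 2c/N is small compared with 2b.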

While it is tempting to assume that it is a biological 
"trigger" or discontinuity associated with hatching that 
underlies the emergence of this biphasic growth curve, 
here we have assessed the ability of several simple and 
continuous models of network formation to reproduce 
this observed behavior without incorporating further biological
detail; see Figure 2 and Methods. We deliberately
restricted ourselves to stochastic one-parameter
models: firstly, because our aim was to isolate the fundamental
ingredients which could be responsible for the
observed discontinuous growth; secondly, because, as we
show in the following, a one-parameter model was indeed
enough to reproduce both the biphasic growth and many
of the structural properties of the adult C. elegans neuronal
network.

The first model we considered was the linear preferential
attachment model, introduced by Barabási & Albert
(BA) [20], which has been successfully employed
to describe the development of many different complex 
networks, from the World Wide Web to the Internet 
and citation networks. The BA model assumes that the 
growth of a network is driven only by its topological prop- 
erties: specifically, newborn neurons are more likely to 
form connections to neurons that are already well con- 
nected. This model predicts a linear relationship between 
N and K, which matches closely the post-hatching phase 
of worm brain development but does not provide a satisfactory
fit to the pre-hatching phase. Conversely, the
binomial accelerated growth (BAG) model, which assumes
that the probability of a connection between a new
neuron and any pre-existing neuron is constant, predicts
that K increases as a quadratic function of N [21]. Similarly,
we observe a quadratic dependence of K on N in
a modified version of accelerated growth (HAG), which
additionally reproduces the node degree distribution of
the adult worm. Accelerated growth models are thus
able to reproduce the pre-hatching phase of the worm
brain's growth but fail to accommodate the transition to
linear scaling of K with N in the post-hatching phase.

Figure 1. Development of the C. elegans nervous system. a) C. elegans reaches maturity roughly 63 hours after fertilization. During
this time, its body length increases from 50 μm to 1150 μm [22-24]. b) In the adult hermaphrodite worm, more than 60% of the neurons
are located in the head and about 15% are found in the tip of the tail (based on data modified according to [17]; axis arbitrarily centred
such that the origin is at the base of the head). Neurons are colored by ganglion membership [16]: anterior [A], dorsal [B], lateral [C],
ventral [D], retrovesicular [E], posterior lateral [F], ventral cord [G], pre-anal [H], dorso-rectal [J], lumbar [K]. c) The total number of
neurons (N, solid black) and connections (K, dashed blue) grows rapidly between 250 and 500 minutes after fertilization. Another burst
of neurogenesis is observed at the end of the L1 larval stage (using data from [17]). d) Plotting the number of synapses as a function of
the number of neurons (yellow circles) reveals the presence of a phase transition. Before hatching, K grows as $N^2$ (solid blue line), while
after hatching K grows linearly with N (dashed green line). The inset shows the plot of the average nodal degree versus N.

We found that economical trade-off models, which take
into account the spatial location of neurons while allowing
for some long distance connections to high degree
nodes, were able to reproduce biphasic growth more accurately.
As a first approximation, we defined the Economical
Spatial Growth (ESG) model, which assumes
that the probability of a connection forming between
newborn neuron i and pre-existing neuron j is a product
of the degree of the jth node in the adult nervous system,
and a decreasing exponential function of the Euclidean
distance $d_{ij}$ between nodes i and j in the adult worm.
Although the modeled growth exhibits two phases, the
transition between quadratic and linear phases occurs before
hatching. Therefore, we considered a more refined
Economical Spatio-Temporal Growth model (ESTG),
where $d_{ij}$ is estimated by the Euclidean distance between
neurons i and j at the time of birth of the newborn
neuron, thereby adjusting for the fact that the connection
distance between any pair of neurons will be shorter
at earlier stages of development before the worm becomes
elongated. We extrapolated the position of each neuron
during growth from its position in the adult worm, assuming
that each neuron's position was shifted along the
longitudinal axis in proportion to the overall changes
in body length (see Figure 1a), which we collated from
the literature [22] (for the pre-hatching stage) and [53]
(after hatching), using a linear interpolation between larval
stages [Mj. While the penalty on connection distance
remains fixed in this model, its effect on connectivity as
a function of the overall scaling of the system is dynamically
evolving. Indeed, the trade-off between distance and
topological degree is increasingly biased in favor of minimizing
connection distance as development proceeds and
the worm becomes longer overall. The model provides an
excellent fit to the two observed scalings of K as a function
of N in the biological data, including a good approximation
of the moment of hatching to the transition point
from one growth regime to the other.

This suggests that the discontinuity in the growth 
curve is not explained by biological triggers related to 
hatching but is instead a consequence of the spatial prop- 
erties of the system. In particular, the average distance of 
newly born neurons relative to all other neurons is much 
greater after hatching, so that the distance penalty term 
begins to dominate the trade-off embodied in the spatial
growth rules. This is especially obvious in the ESTG
model, where the worm's elongation causes distances to 
increase in the interim between the two bursts of neuro- 
genesis. Note, however, that a transition is already vis- 
ible in the ESG model. This can be explained by not- 
ing that most neurons born after hatching are located 
along the body of the worm rather than in the head (see 
Appendix Section S1 and Fig. S1), so that the average
distance between these newly born neurons and all
others is again increased after hatching. We have also 
tested the ability of other one-parameter models to re- 
produce the observed growth curve (see Appendix Sec- 
tion S2); in particular, we tried to encode the cost of 
long connections through a power-law decay instead of 
an exponential one, but none of the alternative models 
was able to accommodate the abrupt change in the func- 
tional relation between K and N with the same accuracy 
obtained by ESTG (see Appendix Section S4, Table S-II 
and Fig. S2). 

Figure 2. Modeling network growth. a) The linear preferential attachment model (BA, blue squares) fails to reproduce the biphasic
growth observed (solid line). b) In the binomial accelerated growth model (BAG; magenta squares) and the hidden-variable accelerated
growth model (HAG; dashed blue line), the average node degree increases linearly with the size of the network. c) The economical
spatial growth model (ESG; green squares) exhibits a biphasic behavior, yielding a transition from quadratic to nearly linear growth at
N < 180, but fails to capture the details of the observed growth. d) The economical spatio-temporal growth model (ESTG; red squares)
accurately reproduces the details of the biphasic growth trajectory; for example, the inflection point of the modeled developmental curve
corresponds closely to the moment of metamorphosis (hatching). The red dashed line in each panel indicates the number of nodes at the
time of hatching (N ≈ 200). The standard error of each growth curve is smaller than the size of the symbols used to plot it, and is not
reported.

The economical spatio-temporal growth model also
provides a good account of several other features of the
adult nervous system's organization, including the statistical
distributions of node degree, node efficiency, and
edge length in the adult worm brain (see Figure 3a). According
to the results obtained through the computation
of the symmetrized Kullback-Leibler divergence, ESTG
is the model which most closely reproduces the distributions
of node degree, edge length and node efficiency (see
Appendix Section S5, Table S-III and Figs. S3, S4 and
S5).
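
The comparison of distributions described above can be illustrated with a minimal sketch of a symmetrized Kullback-Leibler divergence between two histograms. Several details are assumptions rather than the procedure of Appendix Section S5, which is not reproduced here: the symmetrized form is taken as D(P||Q) + D(Q||P) (some authors use half of this sum), empty bins are handled with a small additive constant `eps`, and the function name is hypothetical.

```python
import math


def symmetrized_kl(p, q, eps=1e-12):
    """Return D(P||Q) + D(Q||P) for two histograms over the same bins.

    Assumption: a small constant eps is added to every bin and the
    histograms are renormalized, so that empty bins do not yield log(0).
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    ps = [x + eps for x in p]
    qs = [x + eps for x in q]
    zp, zq = sum(ps), sum(qs)
    ps = [x / zp for x in ps]
    qs = [x / zq for x in qs]
    # sum (p - q) * log(p / q) equals KL(P||Q) + KL(Q||P)
    return sum((a - b) * math.log(a / b) for a, b in zip(ps, qs))
```

A value of zero indicates identical histograms; larger values indicate that a model-generated distribution departs further from the observed one.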

Moreover, the model can provide a reasonable account 
of finer-grained details of the adult system, such as the 
anatomical variation in the average node degree and 
nodal efficiency along the length of the worm. Networks 
simulated by the model also had a mesoscopic structure 
which closely resembled the pattern of clustered con- 
nectivity between neurons belonging to one of 10 ganglia 
previously defined on biological grounds. Neurons be- 
longing to the same ganglion in the worm brain tend 
to have high connectivity with each other and relatively 
sparse connectivity to neurons in other ganglia [18, 19].
This biological pattern and the neurons belonging to each
specific ganglion were quite accurately reproduced by the
economical spatio-temporal growth model (Figure 3c).

DISCUSSION 

We have shown that a fairly simple economical model 
was adequate to account for many aspects of the spatial 
and topological development of the nervous system of the 
nematode worm, C. elegans. We describe this generative 
model as economical because it represents the formation 
of synaptic connections probabilistically as a trade-off 
between topological value and wiring cost. More spe- 
cifically, the model accommodates the potentially com- 
petitive tendencies of each new neuron to connect to to- 
pologically important hub neurons, which may be a long 



distance away (~1 mm), versus connecting only to neurons
that are spatially adjacent (< 0.1 mm), which will
conserve wiring cost. Crucially, in estimating the con- 
nection cost between pairs of neurons we have used prior 
data on the birth time of each neuron and the progressive 
elongation of the worm's body to estimate the distance 
between each pair of neurons at the time of synapse form- 
ation. This measure of connection cost was traded-off 
against a topological bias (preferential attachment) for 
new neurons to connect to high degree hub neurons of the 
adult nervous system. As the worm's body progressively 
elongates, the cost penalty predominates and long dis- 
tance connections, even to hub nodes, become less likely. 
This simple but novel model of a dynamically evolving 
economical trade-off between cost and topology has al- 
lowed us to reproduce a phase transition in the growth 
of the C. elegans cellular connectome coinciding closely 
with the moment of hatching, or metamorphic transition 
from egg to larval stages of development. Dynamical eco- 
nomical growth processes also simulated several aspects 
of the configuration of the adult nervous system. 

The principle that nervous systems conserve wiring 
cost dates back to the seminal work of Ramón y Cajal
in the 19th century and it has been experimentally
validated and theoretically developed extensively
since then. Many aspects of brain organization, ranging
from the placement of neurons in the adult C. elegans
nervous system [S], to the shape of dendritic trees
[25] and the modular architecture of large-scale human
brain networks [H], have been plausibly attributed to
a parsimonious drive to minimize wiring cost. How- 
ever, a strictly cost-minimal network would have a reg- 
ular, lattice-like topology. Synaptic connections would 
be clustered between spatially and topologically neigh- 
bouring neuronal nodes, with none of the long distance 



axonal projections needed to mediate topologically effi- 
cient communication between widely separated neurons. 
But this is not a recognisable description of nervous sys- 
tem topology. In many species, and at many scales of 
space and time, it has been found that brain structural 
and functional networks have shorter average path length 
or greater efficiency than a regular lattice. Brain net- 
works also consistently have non-regular properties like 
high-degree hubs in a fat-tailed degree distribution, and 
a modular community structure entailing long distance 
inter-modular connections between neurons in anatom- 
ically distributed modules. Many of these topological 
features are more than minimally expensive or incur a 
premium in wiring cost; but they can add value to the 
overall performance of the system. For example, high- 
degree hub nodes of the C. elegans nervous system in- 
clude many of the so-called command interneurons which 
play a key role in the adaptive function of coordinated 
forward and backward movement of the worm [15, 16].
Topological efficiency of human brain networks has been 
positively correlated with normal variation in IQ (more 
intelligent people tend to have more efficient structural 
and functional networks) O [37] . Trade-offs between cost 
and efficiency have been shown to be heritable proper- 
ties of human brain networks derived from functional 
magnetic resonance imaging (fMRI) data [28]; and economical
models of network formation can reproduce the
(somewhat different) statistical properties of fMRI net- 
works in both healthy adults and patients with schizo- 
phrenia [29]. These and other observations support the 
general idea that nervous systems are selected to negoti- 
ate an economical trade-off between wiring cost (usually 
measured by connection distance) and topological value 
(which could be measured by degree, efficiency or a num- 
ber of other network properties related to adaptive brain 
function). 

So the basic principles of the economical model invest- 
igated here are not new to the neuroscience literature [7] . 
However, there are several distinctive aspects of our res- 
ults. Firstly, this work is an innovative demonstration 
that economical models can account for the growth of a 
nervous system described quite concretely and exactly at 
the cellular scale of synaptic connections between neur- 
ons. Many of the previous studies of economical trade- 
offs in brain networks have been based on analysis of 
statistical associations (so-called functional connectivity) 
between fMRI time series recorded at different spatial 
locations [28]; or on analysis of large-scale axonal pro- 
jections rendered by tractography algorithms applied to 
diffusion imaging data [30]. Such human neuroimaging
results indicate that economical principles may apply to 
network formation at macroscopic scales, but the neur- 
onal substrate of networks based on imaging statistics 
remains unresolved. The demonstration here of econom- 
ical principles applying to a connectome described with 
much greater precision at a cellular scale somewhat validates
the prior neuroimaging results. Moreover, it suggests
that the same competitive selection criteria may inform 
nervous system formation over multiple spatial scales. 
Brain networks may have a scale-invariant or fractal eco- 
nomy of organization. 

More broadly, these results are innovative in demon- 
strating directly how simple economical growth mod- 
els can provide a reasonable account of complex growth 
curves, such as the non-linear processes of nervous sys- 
tem maturation and metamorphosis, from egg to adult 
worm. Nematodes, like all members of the superphylum Ecdysozoa,
develop through discrete stages (egg, several juvenile stages,
adult) separated by moulting events. The situation in
C. elegans is most closely analogous to hemimetabolous 
insects (with "incomplete metamorphosis") since the ju- 
venile stages resemble the adults apart from the absence 
of mating/reproductive structures. However, each moult 
can be considered metamorphic, with the L1-L2 and L4- 
adult moults in particular known to involve both the ad- 
dition of new cells and formation of new synaptic links. 
The special significance of the egg-L1 transition has perhaps
been less appreciated up to now, and as such represents
a unique finding of this work.

Our more realistic modelling of connection cost, taking 
into account the changing spatial constraints during the 
growth of the system, also shines a different light on the 
many previous studies of connection cost [3 [TTJ [16] in 
this paradigmatic complex system. Further work will be 
needed to test the hypothesis that the specific parameters 
of this model correspond to discrete molecular or genetic 
signals. It is imaginable, for example, that a penalty on 
long distance connections could be biologically coded by 
the spatial gradient of an axonally attractive molecule 
diffusing from neurons; or that neurons destined to have 
high degree in the adult system express distinctive cell 
surface markers from birth that favor synaptic formation. 

We have compared the performance of the dynamically 
evolving economical model to that of a number of other 
models and found, as expected theoretically, that simpler 
models based on preferential attachment rules could re- 
produce one or other of the two phases (quadratic or lin- 
ear) of network development pre- or post-hatching. How- 
ever, only economical models which traded-off connec- 
tion distance versus preferential attachment bias could 
reproduce both phases and the timing of phase trans- 
ition was only accurately reproduced by the dynamic 
linkage between inter-neuronal connection distance and 
progressive developmental elongation of the whole organ- 
ism. For this reason, we consider that the modelling res- 
ults affirm our hypothetical prediction that development 
of this cellular connectome can be accounted for by con- 
tinual re-negotiation of an economical trade-off between 
connection cost and the formation of high degree hubs. 
This affirmation is conditional on the caveat that not all
possible models have been comparatively evaluated. It is
possible that a better model, perhaps incorporating a few
more relevant biological details (such as type of synapse,
electrical or chemical), could be developed in future.

Figure 3. Local and mesoscopic network structures. a) The distributions of node degree (Left, blue), connection distance (Center,
red) and node efficiency (Right, orange) of model-generated networks closely match those observed in the C. elegans neuronal network
(shown in gray). b) This panel shows how the average node degree (Upper) and the average node efficiency (Lower) vary along the length
of the C. elegans body (solid black lines) and in networks generated using the ESTG model (red dashed lines). c) Networks created using
the ESTG model (Right) also reproduce the pattern of intra- and inter-ganglia connections observed in C. elegans (Left). Brighter
colors indicate higher connection density; letters A-K denote neuronal ganglia as defined in the legend to Figure 1.

We note that economical principles of network form- 
ation demonstrated here for the growth of the nervous 
system of the nematode worm are not necessarily lim- 
ited to this system. Many other systems, besides brains, 
are both spatially embedded and topologically complex. 
We anticipate that economical growth models of the po- 
tentially changing trade-offs between physical connection 
cost and topological value may also contribute to future 
understanding of the development and evolution of trans- 
port, computational and infrastructural systems. 



MATERIALS AND METHODS 

Data. We have used the most up-to-date map of the C. 
elegans connectome [TB] , consisting of 279 somatic neur- 
ons interconnected through 6393 chemical synapses, 890 
gap junctions and 1410 neuromuscular junctions. Since 
gap junctions often overlap with synapses and synaptic 
connections are often reciprocated, we have considered 
only the backbone network, where all the synapses and 
gap junctions between each pair of neurons are represented
by a single undirected edge, obtaining a graph with
N = 279 nodes and K = 2287 edges in total (neuromuscular
connections were excluded). Information about the
growth of the neuronal network, in particular on the exact
time of birth of each neuron, has been reconstructed
from recent literature [17].

Linear Preferential Attachment. The Barabási and
Albert (BA) model assumes that the growth of a network
is solely driven by its topological structure, and
produces random graphs with a power-law degree distribution
$p_k \sim k^{-\gamma}$, where $\gamma \approx 3$ [20]. In the model, a
new node is added at each time step and is connected to m
existing nodes. The probability for the new node i to be
connected to an existing node j is a linear function of the
degree $k_j$, namely:

$$\Pi^{\mathrm{BA}}_{i \to j} = \frac{k_j}{2K} \tag{1}$$



where K denotes the total number of links when the new 
node arrives. Since each node chooses m neighbors to 
connect with, the total number of links increases linearly 
with the size of the network, and the average node degree 
remains constant. 
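
The linear growth of K under preferential attachment can be checked with a minimal simulation sketch. Several choices here are assumptions, not details taken from the text: the seed graph is a clique on m + 1 nodes, the m targets of each new node are forced to be distinct (which deviates slightly from sampling each one exactly with probability $k_j/2K$), and the function name `grow_ba` and the value m = 8 (roughly K/N for the adult network) are illustrative.

```python
import random


def grow_ba(n_nodes, m, seed=0):
    """Grow a graph by linear preferential attachment.

    Returns a list of (N, K) pairs, one per growth step.
    """
    rng = random.Random(seed)
    # Assumed seed graph: a clique on m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Each node appears in `stubs` once per incident edge, so a uniform
    # draw from `stubs` selects node j with probability k_j / 2K.
    stubs = [v for edge in edges for v in edge]
    history = [(m + 1, len(edges))]
    for new in range(m + 1, n_nodes):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for j in targets:
            edges.append((new, j))
            stubs.extend((new, j))
        history.append((new + 1, len(edges)))
    return history


history = grow_ba(279, 8)
```

Every step adds exactly m links, so K is a linear function of N and the ratio 2K/N settles near 2m.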

Accelerated Topological Growth. Traditionally, network
growth is said to be accelerated if the average node
degree increases with the size of the network. Acceleration
has been observed in many complex networks and
different models of scale-free networks with acceleration
have been proposed so far [21]. We have considered two
different accelerated growth models. In the first model,
called Binomial Accelerated Growth (BAG), a new node
i tries to establish a connection with each of the existing
nodes, and a link to node j is created with probability p,
namely:

$$\Pi^{\mathrm{BAG}}_{i \to j} = p \tag{2}$$



The BAG model produces networks in which the number
of links increases as the square of N. In fact, the expected
number of links established when the network has N
nodes is:

$$K(N) = p \sum_{i=1}^{N} (i - 1) = p \, \frac{N(N-1)}{2} \tag{3}$$

The BAG model produces networks with a binomial degree
distribution, since it is equivalent to an Erdős-Rényi
random graph model, where each of the $N(N-1)/2$ potential
links appears with probability p [31].
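
The quadratic growth law of the BAG model can be illustrated with a short simulation sketch. The function names are hypothetical, and the value of p used below is not the fitted parameter reported in the Appendix; it is simply the value that makes the expected number of links equal 2287 at N = 279.

```python
import random


def expected_links(n, p):
    """Expected number of links after n nodes: p * n * (n - 1) / 2."""
    return p * n * (n - 1) / 2


def grow_bag(n_nodes, p, seed=0):
    """Binomial accelerated growth; returns a list of (N, K) pairs."""
    rng = random.Random(seed)
    k_total = 0
    history = []
    for n_existing in range(n_nodes):
        # The new node links to each existing node independently
        # with probability p.
        k_total += sum(rng.random() < p for _ in range(n_existing))
        history.append((n_existing + 1, k_total))
    return history


p_adult = 2287 / (279 * 278 / 2)  # illustrative value, see lead-in
history = grow_bag(279, p_adult)
```

Comparing `history` with `expected_links(N, p_adult)` at each N shows the simulated curve fluctuating around the parabola.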

We also introduced a second model of accelerated
growth, called Hidden-variable Accelerated Growth
(HAG). In general, hidden-variable models produce networks
with a prescribed degree distribution: the HAG
model grows random networks having, on average, the
same degree distribution observed in the adult C. elegans
neural network. The model works as follows. We assign
to each node j of the network, once and for all, a hidden
variable $h_j$. In particular, we set $h_j$ equal to the
degree of node j in the adult worm. When a new
node i arrives, it tries to establish a link with each of the
nodes in the network, and a link to node j is created with
probability:

$$\Pi^{\mathrm{HAG}}_{i \to j} = p \, \frac{h_j}{h_{\max}} \tag{4}$$



where $h_{\max}$ is the maximum of $h_j$ over j, and p is appropriately
selected in order to reproduce the final number
of links. It is possible to prove that the final degree of
a node i over different network realizations is Poisson
distributed around an average value equal to the degree of
node i in the adult worm. Consequently, networks produced
by HAG show an accelerated growth similar to that generated
by the BAG model, while also preserving the actual degree
distribution of the C. elegans neural network.

ESG. To create networks embedded in Euclidean 
space E] , we considered the economical spatial growth 
model, which is based on a trade-off between the tend- 
ency to create topologically important connections to 
hubs and the physical distance between neurons. When 
a new node i arrives, it is placed in the position it occupies
in the adult worm, and a link to each of the existing
nodes is created with probability:

$$\Pi^{\mathrm{ESG}}_{i \to j} = \frac{h_j}{h_{\max}} \, e^{-d_{ij}/\delta} \tag{5}$$

where the values $h_j$ are assigned as in the HAG model
and $\delta$ is a parameter tuning the typical connection distance.
Namely, the probability of creating a link decreases
exponentially with the Euclidean distance $d_{ij}$ that
separates i and j in the adult worm, and is weighted by
the hidden variable $h_j$, the degree of node j in the adult
worm (in order to preserve the actual degree distribution
of the C. elegans neural network).

ESTG. The Economical Spatio- Temporal Growth 
model, using information about the length of the worm 
at different stages, takes into account the actual spatial 
position of each neuron, while the worm grows over time. 
When a new node i arrives at time t, it is placed in the
position it occupies in the C. elegans neural network at that
time, and a link to each of the existing nodes j is created
with probability:

$$\Pi^{\mathrm{ESTG}}_{i \to j} = \frac{h_j}{h_{\max}} \, e^{-d_{ij}(t)/\delta} \qquad (6)$$

where the values $h_j$ are assigned as in the HAG model,
and $\delta$ is a parameter tuning the typical edge length.
Notice that the probability to establish a link depends on
the time at which node i appears, since the distance dij (t) 
depends on the relative positions of i and j, which change 
over time due to elongation of the worm's body. We con- 
sidered the real length of the worm at each time, and we 
estimated the position of each node at that time using 
linear interpolation and assuming a uniform expansion of 
the worm along the longitudinal axis. 
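
The position estimate and the resulting link probability can be sketched as follows. This is only an illustration under stated assumptions: Eq. (6) is taken to be $(h_j / h_{\max}) \, e^{-d_{ij}(t)/\delta}$, positions are two-dimensional with the first coordinate along the longitudinal axis, and the last entry of `lengths` is the adult body length. All function and variable names are hypothetical.

```python
import math
from bisect import bisect_right

def body_length(t, times, lengths):
    """Linearly interpolate the body length at time t from (time, length) samples."""
    if t <= times[0]:
        return lengths[0]
    if t >= times[-1]:
        return lengths[-1]
    k = bisect_right(times, t)              # times[k-1] <= t < times[k]
    w = (t - times[k - 1]) / (times[k] - times[k - 1])
    return lengths[k - 1] + w * (lengths[k] - lengths[k - 1])

def position_at(adult_xy, t, times, lengths):
    """Uniform expansion: rescale only the longitudinal coordinate."""
    scale = body_length(t, times, lengths) / lengths[-1]
    return (adult_xy[0] * scale, adult_xy[1])

def estg_probability(xy_i, xy_j, h_j, h_max, t, times, lengths, delta):
    """Assumed form of Eq. (6), evaluated at the arrival time t of node i."""
    d_t = math.dist(position_at(xy_i, t, times, lengths),
                    position_at(xy_j, t, times, lengths))
    return (h_j / h_max) * math.exp(-d_t / delta)

# Toy usage: the same pair of neurons is more likely to connect early,
# when the body is short, than late, when it has elongated.
times, lengths = [0.0, 10.0], [0.2, 1.0]
early = estg_probability((0.0, 0.0), (0.5, 0.0), 10, 20, 0.0, times, lengths, 0.1)
late = estg_probability((0.0, 0.0), (0.5, 0.0), 10, 20, 10.0, times, lengths, 0.1)
print(early > late)
```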
Parameter Tuning. The first requirement of any suit- 
able model for the C. elegans neuronal network growth 
is to produce networks having N = 279 nodes and, on 
average, K = 2287 edges, as observed in the adult worm. 
We used Monte Carlo simulations and iterative bisection 
to identify the interval in the parameter space for which 
the expected total number of edges K of the generated 
networks was equal to 2287 ± 1%; see Appendix Section 
S3 for methodological details and the optimal parameter 
values for each of the eight models in Appendix Table S-I. 
Degree Distribution. Given an undirected graph
$G(V, E)$ associated with the symmetric adjacency matrix
$A = \{a_{ij}\}$, the degree of a node i is defined as the
number of edges incident on i, and is denoted by
$k_i = \sum_j a_{ij}$. The degree distribution $P(k)$ of the
graph indicates, for each value of k, the probability of
finding a node whose degree is equal to k.

Connection Distance Distribution. Given two directly
connected nodes i and j of a spatially-embedded network, we
define the distance of the edge (i, j) as the Euclidean
distance $d_{ij}$ separating node i and node j. The distance
distribution $P(d)$ is the probability of finding an edge
whose distance is exactly equal to d.



Node and Graph Efficiency. Given an undirected and
unweighted graph G, the efficiency of a node i is defined
as:

$$E_i = \frac{1}{N-1} \sum_{j=1,\, j \neq i}^{N} \frac{1}{\lambda_{ij}} \qquad (7)$$

where $\lambda_{ij}$ is the path length between node i and
node j, measured as the number of edges in the shortest path
connecting i to j. The smaller $\lambda_{ij}$, the larger the
contribution of node j to the efficiency of i. The efficiency
of a graph is defined as the average efficiency of its nodes.

Section S1. Location of neurons born after hatching

In Fig. S-1 we show the spatial configuration of neurons
before and after hatching. Notice that the majority
of the neurons born before hatching are concentrated in
the head and in the tail region, while most of the neurons
appearing after hatching are instead placed in the
body to form the ventral cord. This explains the relatively
greater distance from newly added neurons to existing
ones observed after hatching.

Section S2. Additional one-parameter models 

We present here three additional growth models which
have been tested during this study, namely the Simple
Spatial Growth (SSG), Spatial Growth with Elongation
(SGE) and Power-law Economical Growth (PEG). We
also discuss their ability to reproduce the developmental
growth of the C. elegans neuronal network, and compare
them with the other five models described in the main
text, i.e. Barabasi-Albert (BA), Binomial Accelerated
Growth (BAG), Hidden-variable Accelerated Growth (HAG),
Economical Spatial Growth (ESG) and Economical
Spatio-Temporal Growth (ESTG). Notice that all the models
considered in this study have only one free parameter.
Nevertheless some of these models, and in particular the
ESTG, are exceptionally accurate at reproducing the
structure and development of the C. elegans neuronal network.

Simple Spatial Growth (SSG). This model makes the
assumption that upon arrival a new node i is placed in
the same position at which it appears in the adult worm.
Then, node i creates an edge to each of the already
existing nodes j with probability:

$$\Pi^{\mathrm{SSG}}_{i \to j} = e^{-d_{ij}/\delta} \qquad \text{(S-1)}$$

where $d_{ij}$ is the distance between node i and node j in
the adult worm and $\delta$ is a parameter tuning the typical
edge length. Since the connection probability decreases
exponentially with the distance between nodes in the adult
worm, the resulting networks exhibit very few medium- and
high-distance links, which are instead relatively frequent in
the real C. elegans neuronal network.

Spatial Growth with Elongation (SGE). This model
uses information about the length of the worm at different
stages. The node i arriving in the network at time t is
placed in the position it occupies in the neural network at
that time, and the probability for i to connect to an
existing node j is defined as:

$$\Pi^{\mathrm{SGE}}_{i \to j} = e^{-d_{ij}(t)/\delta} \qquad \text{(S-2)}$$

where $d_{ij}(t)$ is the distance between node i and node
j at time t and $\delta$ is a parameter. Notice that
$d_{ij}(t)$ is a function of time, so that the probability to
create an edge between a newly arrived node i and an existing
node j depends on the time at which node i arrives in the
network and on the relative positions of i and j at that
time. This makes possible the creation of edges between
nodes which are actually separated by a relatively large
distance in the adult worm but have been closer in space
in earlier developmental stages.

Figure S-1. Position of neurons born after hatching. The large majority of neurons born after hatching are located
throughout the worm's body, while most of the neurons born before hatching are concentrated in the head and in the tail. The
X-axis represents the distance in millimeters from the base of the head. Positive values indicate points in the worm's head,
while negative values correspond to the body and the tail.

Power-law Economical Growth (PEG). This model
implements a trade-off between the tendency to create edges
to hubs and the relative distance of the nodes, and takes
into account the elongation of the worm during development.
Unlike the Economical Spatio-Temporal Growth model presented
in the main text, in which the connection probability is a
decreasing exponential function of distance, in this model
the probability to connect to a distant node decreases as a
power-law:

$$\Pi^{\mathrm{PEG}}_{i \to j} = \frac{h_j}{h_{\max}} \left[ 1 - \left( \frac{d_{ij}(t)}{L_t} \right)^{\alpha} \right] \qquad \text{(S-3)}$$

Here, $h_j$ is the hidden degree of node j, which is set equal
to the degree of node j observed in the adult worm, while
$h_{\max}$ is the maximum node degree in the adult neural
network. As for the ESTG model, $d_{ij}(t)$ is the distance
between node i and node j in the worm at time t. $L_t$ is
the total worm length at time t and $\alpha$ is the exponent
of the power-law. Notice that the attachment probability
$\Pi^{\mathrm{PEG}}_{i \to j}$ approaches 0 when the distance
$d_{ij}(t)$ is comparable with the length of the worm, while
the hidden degree of the destination node plays a more
important role if the two nodes are closer in space. Thanks
to the preferential attachment term, based on the hidden
degree of the nodes, this model tends to preserve the degree
distribution of the original network.
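
The two limits described for the PEG model can be checked numerically. The sketch below assumes that Eq. (S-3) reads $(h_j / h_{\max}) [1 - (d_{ij}(t)/L_t)^{\alpha}]$; the function name and the toy values are hypothetical.

```python
def peg_probability(d_ij_t, length_t, h_j, h_max, alpha):
    """Assumed form of Eq. (S-3): hidden-degree weight times a power-law distance penalty."""
    return (h_j / h_max) * (1.0 - (d_ij_t / length_t) ** alpha)

# Toy usage with a hypothetical hub of hidden degree 10 out of h_max = 20.
for d in (0.0, 0.1, 0.5, 1.0):
    print(d, peg_probability(d, 1.0, 10, 20, alpha=0.0232))
```

Under this assumption the probability equals $h_j / h_{\max}$ for coincident nodes and vanishes when the two nodes are a full body length apart.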

Section S3. Parameter tuning 

In this study we considered only one-parameter randomized
growth models. In general, a randomized model
generates an ensemble of graphs having certain character- 
istics. If the model has a tunable parameter, each value 
of the parameter generates a family of graphs sharing 
similar structural properties. For instance, the Binomial 
Accelerated Growth (BAG) model produces networks in 
which the number of edges grows quadratically with the 
number of nodes, but the expected number of edges in the 
final network, i.e. when the number of nodes is equal to 
N = 279, depends on the actual value of the attachment
probability p. 

Since a randomized one-parameter model generates a 
family of graphs for each value of the parameter, its abil- 
ity to reproduce the structure of a given network cannot 
be assessed through a direct comparison of the original 
graph with a single realization of the model. Instead, 
the comparison should be performed by taking into ac- 
count the expected structural properties of the ensemble 
of networks generated by the model, for each value of 
the parameter, averaging over a sufficiently large num- 
ber of realizations. The first requirement of any suitable 
model for the C. elegans neural network growth is to produce
networks having N = 279 nodes and, on average,
K = 2287 edges. This constraint has been used to find 
the optimal parameter of each considered model. 

We employed a two-step parameter optimization process.
In the first step we used a Monte-Carlo approach
to identify the interval in the parameter space for which
the expected total number of edges K of the generated
networks was equal to 2287 ± 5%. In this step, we
considered 20 networks for each value of the parameter. In
the second step we iteratively shrank the parameter
interval using the bisection method, in order to identify the
value for which the difference between the expected number of
edges and K = 2287 was smaller than 1%. In this step we
generated 500 networks for each value of the parameter. The
optimal parameter values for each of the eight models are
reported in Table S-I.



Model   Optimal parameter
BAG     $p = 0.0575$
HAG     $p = 0.302$
BA      $m_0 = 8$, $m = 8$
SSG     $\delta = 0.01365$
SGE     $\delta = 0.00235$
PEG     $\alpha = 0.0232$
ESG     $\delta = 0.0858$
ESTG    $\delta = 0.0126$

Table S-I. Optimal model parameters. The optimal parameter
of a model guarantees the generation of networks having
the same number of edges as the C. elegans adult neural
network, with an error smaller than 1%.
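
The second, bisection step of the tuning procedure can be sketched as below. This is an illustration only: it uses a simple binomial growth rule (each new node links to each existing node with probability p) as a stand-in model, it assumes that the bracket `[lo, hi]` has already been obtained from the coarse Monte-Carlo scan, and it assumes that the mean edge count increases with the parameter. All names and the toy target are hypothetical.

```python
import random

def bag_edge_count(p, n_nodes, rng):
    """Edges in one realization: each new node links to each existing node with prob. p."""
    return sum(1 for i in range(n_nodes) for _ in range(i) if rng.random() < p)

def mean_edges(p, n_nodes, n_runs, rng):
    """Monte-Carlo estimate of the expected final number of edges."""
    return sum(bag_edge_count(p, n_nodes, rng) for _ in range(n_runs)) / n_runs

def tune_parameter(target, n_nodes, lo, hi, rng, n_runs=500, tol=0.01, max_iter=50):
    """Bisect [lo, hi] until the mean edge count is within tol * target of the target."""
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        k = mean_edges(mid, n_nodes, n_runs, rng)
        if abs(k - target) <= tol * target:
            return mid
        if k < target:      # too few edges: move the lower bound up
            lo = mid
        else:               # too many edges: move the upper bound down
            hi = mid
    return 0.5 * (lo + hi)

# Toy usage: 60 nodes and a target of 177 edges (small numbers keep the run fast).
p_opt = tune_parameter(177, 60, lo=0.05, hi=0.2, rng=random.Random(1), n_runs=50)
print(round(p_opt, 3))
```

Because each evaluation is itself a noisy Monte-Carlo average, the tolerance should not be set much tighter than the sampling error of the mean, which is why more realizations are used in the refinement step than in the coarse scan.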



Section S4. Model comparison 

Since our aim was to reproduce as closely as possible
the developmental growth of the C. elegans neuronal network,
and in particular the abrupt transition in the number of
edges in the graph as a function of the number of
nodes, we defined a measure to quantify how closely each
model matches the curve $\mathcal{K}(N)$, which indicates the
number of edges in the C. elegans neuronal network when N
nodes have been born.

We denote by $\mathcal{K}_M(N)$ the family of curves of K over
N obtained using a certain model M and setting the
value of the model parameter according to Table S-I.
We computed, for each value of N, the expected number
$\mu(\mathcal{K}_M(N))$ of edges in the network generated by
model M when N nodes have been added to the graph, averaging
over 500 realizations. Using this notation,
$\mu(\mathcal{K}_M(100))$ is the expected number of edges in
the graphs generated by model M when the first N = 100 nodes
have been added to the graph.



In Fig. S-2 we report the average curve $\mu(\mathcal{K}_M(N))$
for each of the eight considered models, together with the
original curve $\mathcal{K}(N)$ corresponding to the growth of
the C. elegans neural network. By visual inspection, we
conclude that the model which best fits the developmental
growth of the original network and the phase transition
at hatching is the Economical Spatio-Temporal Growth.
In order to quantify the discrepancy between $\mathcal{K}(N)$
and $\mathcal{K}_M(N)$ we computed, for each model and for
each value of N, the difference $\xi(N)$:

$$\xi(N) = \left| \mathcal{K}(N) - \mu(\mathcal{K}_M(N)) \right| \qquad \text{(S-4)}$$

and we considered the expected value $\mu[\xi(N)]$ and the
standard deviation $\sigma[\xi(N)]$ of $\xi(N)$. In Table S-II
we report the values of $\mu[\xi(N)]$ and $\sigma[\xi(N)]$ for
the eight models considered. In general, smaller values of
$\mu[\xi(N)]$ and $\sigma[\xi(N)]$ indicate a closer match of
the original growth curve. In agreement with the conclusions
drawn after visual inspection of Fig. S-2, which suggested
that ESTG was the model which most closely reproduced the
growth curve, the smallest values of $\mu[\xi(N)]$ and
$\sigma[\xi(N)]$ are indeed obtained by the Economical
Spatio-Temporal Growth model. The networks generated by all
the other models fail to follow the original growth curve by
a large extent, and they consequently exhibit larger values of
$\mu[\xi(N)]$ and $\sigma[\xi(N)]$.

Model   $\mu[\xi(N)]$   $\sigma[\xi(N)]$
BAG     154.2           123.7
HAG     154.2           123.7
BA      216.7           150.7
SSG     205.2           167.1
SGE     89.5            73.7
PEG     209.4           168.4
ESG     215.6           172.9
ESTG    37.3            31.1

Table S-II. Quality of growth fit. Average and standard
deviation of the point-to-point difference between the
observed growth curve $\mathcal{K}(N)$ and the average curve
corresponding to each of the eight considered models. The model
parameters are set according to Table S-I. The smaller the
value of $\mu[\xi(N)]$, the more closely a model can reproduce
the growth of the C. elegans neural network. The Barabasi-Albert
model (BA) exhibits the highest average point-to-point
distance, while the Economical Spatio-Temporal Growth model
(ESTG) largely outperforms all the other models.
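
The two summary statistics of Table S-II can be computed from a pair of growth curves as sketched below. The sketch assumes that both curves are sampled at the same values of N and that the standard deviation is the population standard deviation over N; the source does not specify which estimator was used. The function name and the toy curves are hypothetical.

```python
from statistics import mean, pstdev

def growth_fit(observed, model_mean):
    """Return (mu, sigma) of xi(N) = |K(N) - mu(K_M(N))| over all sampled N."""
    xi = [abs(k - m) for k, m in zip(observed, model_mean)]
    return mean(xi), pstdev(xi)

# Toy usage with hypothetical curves sampled at four values of N.
mu, sigma = growth_fit([0, 1, 3, 6], [0, 2, 4, 6])
print(mu, sigma)
```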



Section S5. Node degree, edge length and node efficiency

Model | $D_{KL}(P(k), P_M(k))$ | $D_{KL}(P(d), P_M(d))$ | $D_{KL}(P(E_i), P_M(E_i))$
Rows: BAG, HAG, BA, SSG, SGE, PEG, ESG, ESTG
[Cell alignment lost in extraction and some cells illegible. Surviving values, in extraction order: 0.346, 0.290, 0.966, 0.301, 0.309, 0.545, 0.226, 0.611, 0.176, 0.149, 0.322, 0.214, 0.708, 0.685, 0.361, 0.143, 0.099, 0.223]

Table S-III. Kullback-Leibler divergence. The symmetrized Kullback-Leibler divergence between the degree, edge length
and node efficiency distributions of the adult C. elegans neural network and the corresponding average distributions of the
networks generated through each of the eight models. Smaller values of symmetrized divergence indicate higher similarity
between the two distributions. The best and second-best values are highlighted in green and yellow, respectively, while the
worst and second-worst are marked in red and orange, respectively. BAG and SSG exhibit the worst values of divergence.
Interestingly, besides being the best model at fitting the developmental growth of the C. elegans neural network (as shown
in Fig. S-2 and in Table S-II), ESTG performs more consistently than any of the other models in reproducing the structural
properties of the adult worm's nervous system.

Here we compare the structure of the networks produced
by each of the eight models described in this study
with that observed in the adult C. elegans neural network,
by using three classical network metrics. The first
metric is the degree distribution. Given an undirected
graph $G(V, E)$ associated with the symmetric adjacency
matrix $A = \{a_{ij}\}$, the degree of a node i is defined as
the number of edges incident on i, and is denoted by
$k_i = \sum_j a_{ij}$. The degree distribution $P(k)$ of the
graph indicates, for each value of k, the probability of
finding a node whose degree is equal to k. The second metric
is the distribution of connection distances. Given two
directly connected nodes i and j of a spatially-embedded
network, we define the distance of the edge (i, j) as the
Euclidean distance $d_{ij}$ separating node i and node j. The
associated distance distribution $P(d)$ is the probability of
finding an edge whose distance is exactly equal to d. The
third metric is node efficiency. Given an undirected and
unweighted graph G, the efficiency of a node i is defined
as:

$$E_i = \frac{1}{N-1} \sum_{j=1,\, j \neq i}^{N} \frac{1}{\lambda_{ij}} \qquad \text{(S-5)}$$

where $\lambda_{ij}$ is the distance between node i and node j,
measured as the number of edges in the shortest path
connecting i to j. The node efficiency of i measures how
easy it is to reach any other node in the graph by starting
from i and traveling across shortest paths. In general,
the smaller the distance between i and j, the higher the
contribution of j to the efficiency of node i. If the graph
is not connected and nodes i and j belong to two different
connected components then there exists no path connecting
them. In this case, the distance $\lambda_{ij}$ is
conventionally set to $\infty$, and the contribution of node j
to the efficiency of i is equal to $1/\infty = 0$.
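
Node efficiency, including the convention for disconnected pairs, can be computed with a breadth-first search. The following is a minimal sketch, assuming the graph is given as a dictionary mapping each node to the set of its neighbours and contains at least two nodes; the function names are hypothetical.

```python
from collections import deque

def node_efficiency(adj, i):
    """Average of 1 / (shortest-path length) from i to every other node."""
    dist = {i: 0}
    queue = deque([i])
    while queue:                            # breadth-first search from i
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Unreachable nodes never enter dist, so they contribute 1/inf = 0.
    return sum(1.0 / d for node, d in dist.items() if node != i) / (len(adj) - 1)

def graph_efficiency(adj):
    """Average node efficiency over all nodes of the graph."""
    return sum(node_efficiency(adj, i) for i in adj) / len(adj)

# Toy usage: a path 0 - 1 - 2, and the same graph with node 2 cut off.
path = {0: {1}, 1: {0, 2}, 2: {1}}
split = {0: {1}, 1: {0}, 2: set()}
print(node_efficiency(path, 0), node_efficiency(split, 2))
```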



In Figs. S-3, S-4 and S-5 we show, respectively, the average
degree distribution, edge length distribution and node
efficiency distribution of the networks generated by each
of the eight models, together with those observed in the
adult C. elegans neural network (reported in each panel
in shaded grey). By visual inspection, we notice that
ESTG seems to be the model which most closely reproduces
all these distributions.

In order to quantify the difference between the distributions
of degree, length and node efficiency of synthetic
graphs and those of the C. elegans neural network we
used the Kullback-Leibler divergence. Given two probability
distributions $P = \{p_i\}$ and $Q = \{q_i\}$, the
Kullback-Leibler divergence of Q from P is defined as:

$$D_{KL}(P \| Q) = \sum_i p_i \log \frac{p_i}{q_i} \qquad \text{(S-6)}$$

The Kullback-Leibler divergence measures the information
lost when Q is used as an approximation of P, and is
non-symmetric, i.e. $D_{KL}(P \| Q) \neq D_{KL}(Q \| P)$.
Since we are interested in measuring the similarity between
two distributions, and not the relative information lost when
using one of them as a predictor of the other, we opted
for the symmetrized Kullback-Leibler divergence, which
is defined as follows:

$$D_{KL}(P, Q) = \frac{D_{KL}(P \| Q) + D_{KL}(Q \| P)}{2} \qquad \text{(S-7)}$$

In general, the smaller the value of $D_{KL}(P, Q)$, the
more similar the two distributions P and Q. If we denote
by $P(c)$ the distribution of the generic quantity
c in the C. elegans neural network and by $P_M(c)$ the
distribution of the same quantity c in networks generated
through model M, the symmetrized Kullback-Leibler
divergence between $P(c)$ and $P_M(c)$ is denoted
as $D_{KL}(P(c), P_M(c))$. In Table S-III we report, for
each model, the values of the symmetrized Kullback-Leibler
divergence between the degree, edge length
and node efficiency distributions of the adult C. elegans
neural network and the networks generated by
each of the eight models, which are respectively denoted
by $D_{KL}(P(k), P_M(k))$, $D_{KL}(P(d), P_M(d))$ and
$D_{KL}(P(E_i), P_M(E_i))$. The best and the second-best
value of $D_{KL}(P, Q)$ for each metric are highlighted in
green and yellow, respectively. Notice that the smallest
values of the symmetrized Kullback-Leibler divergence
are consistently obtained by the ESTG model, with the
only exception being node efficiency for which PEG
outperforms ESTG by a small amount.
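
A sketch of the symmetrized divergence for two histograms defined over the same bins is given below. Several choices here are assumptions rather than details stated in the text: the two directed divergences are averaged, the natural logarithm is used, and a small constant `eps` is added to every bin so that empty bins do not produce a division by zero. The function names are hypothetical.

```python
import math

def kl_divergence(p, q):
    """D_KL(P || Q) = sum_i p_i * log(p_i / q_i); bins with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def smooth(hist, eps):
    """Add eps to every bin and renormalize (assumed handling of empty bins)."""
    total = sum(hist) + eps * len(hist)
    return [(x + eps) / total for x in hist]

def symmetrized_kl(p, q, eps=1e-9):
    """Average of the two directed divergences between smoothed histograms."""
    ps, qs = smooth(p, eps), smooth(q, eps)
    return 0.5 * (kl_divergence(ps, qs) + kl_divergence(qs, ps))

# Toy usage with two hypothetical two-bin distributions.
print(symmetrized_kl([0.5, 0.5], [0.9, 0.1]))
```

The value of the divergence depends on the binning and on how empty bins are treated, so absolute numbers are comparable only across models evaluated with the same choices.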
Figure S-2. Growth curves. The average total number of edges $\mathcal{K}_M(N)$ as a function of N for each of the eight models. The
original growth curve of the C. elegans neural network is reported for reference in each panel, as a solid black line. The SSG,
SGE, ESG and ESTG models exhibit a transition from a quadratic to a linear increasing regime, but only ESTG is able to
closely match the growth curve observed in the original graph.

Figure S-3. Degree distributions. The average degree distribution of the networks generated by each of the eight models.
The degree distribution of the adult C. elegans neural network is reported in each panel in shaded gray, for comparison. Only
the models based on hidden variables, i.e. HAG, PEG and ESTG, are able to reproduce the degree distribution of the worm
more closely. In the BA, SSG and ESG models low-degree nodes are over-represented, while in the BAG and SGE models
low-degree nodes are substantially under-represented.

Figure S-4. Edge length distributions. The average distribution of edge length in the networks generated by each of the
eight models, compared with the distribution of edge length observed in the adult C. elegans network (reported in shaded gray).
BAG, HAG, BA and PEG produce networks with substantially longer links, while SSG, SGE and ESG exhibit a substantially
larger percentage of short links (notice the different scale of the y-axis in the SSG panel). The only model which closely matches
the distribution of edge length is ESTG.

Figure S-5. Node efficiency distribution. The distribution of node efficiency of networks generated with each of the
eight models, compared with that observed in the adult C. elegans (shaded gray). Both BAG and HAG produce binomial
distributions of node efficiency; for the SSG, SGE and ESG models the distribution of efficiency is skewed towards smaller values,
while BA is able to capture the peak around $E_i = 0.47$. PEG and ESTG reproduce the original distribution in a more balanced
way, although nodes with efficiency around 0.5 are substantially over-represented while the peak around 0.47 is missing.



